SWE-smith

Scaling Data for Software Engineering Agents

April 30, 2025

Creating training data for software engineering agents is difficult. Until now.

Introducing SWE-smith: Generate 100s to 1000s of task instances for any GitHub repository.

We've generated 50k+ task instances for 128 popular GitHub repositories, then trained our own LM for SWE-agent.

The result? SWE-agent-LM-32B achieves 40% pass@1 on SWE-bench Verified.

Now, we've open-sourced everything, and we're excited to see what you build with it!

Check out the tutorial below to generate 100 task instances for any GitHub repository in 10 minutes.

Click here for an extended discussion.

️🔥 Excited about SWE-smith? Build with us!

> Create new bug generation techniques.

> Expand to non-Python repositories.

> Train better SWE-agents!

Read our documentation or code for more.

Authors

John Yang, Kilian Lieret, Carlos E. Jimenez, Alexander Wettig, Kabir Khandpur, Yanzhe Zhang, Binyuan Hui, Ofir Press, Ludwig Schmidt, Diyi Yang

Affiliations

Stanford University, Stanford SALT Lab, Princeton Language & Intelligence, Alibaba Qwen

Citation

@misc{yang2025swesmith,
  title={SWE-smith: Scaling Data for Software Engineering Agents},
  author={John Yang and Kilian Lieret and Carlos E. Jimenez and Alexander Wettig and Kabir Khandpur and Yanzhe Zhang and Binyuan Hui and Ofir Press and Ludwig Schmidt and Diyi Yang},
  year={2025},
  eprint={2504.21798},
  archivePrefix={arXiv},
  primaryClass={cs.SE},
  url={https://arxiv.org/abs/2504.21798},
}